Quasi-parametric optical flow estimation

ABSTRACT

An image processing system includes a processor and optical flow (OF) determination logic for quantifying relative motion of a feature present in a first frame of video and a second frame of video that provide at least one of temporally and spatially ordered images with respect to the two frames of video. The OF determination logic configures the processor to implement performing OF estimation between the first frame and second frame using a pyramidal block matching (PBM) method to generate an initial optical flow (OF) estimate at a base pyramid level having integer pixel resolution, and refining the initial OF estimate using at least one pass of a modified Lucas-Kanade (LK) method to provide a revised OF estimate having fractional pixel resolution.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of application Ser. No. 15/081,118 filed Mar. 25, 2016, which claims priority to Indian Provisional Patent Application No. 6508/CHE/2015 filed on Dec. 4, 2015, which is hereby incorporated herein by reference in its entirety.

CROSS-REFERENCE TO COPENDING APPLICATIONS

This application has subject matter related to copending application Ser. No. 14/737,904 entitled “Optical flow determination using pyramidal block matching” filed on Jun. 12, 2015.

FIELD

Disclosed embodiments relate to optical flow estimation including the use of pyramidal block matching.

BACKGROUND

The observed motion of objects in a sequence of images due to relative motion between an optical sensor, such as a camera, and the objects present in the image is termed optical flow or optic flow. The term optical flow is generally applied in the computer vision domain to incorporate related techniques from image processing and control of navigation, such as motion detection, object segmentation, time-to-contact information, focus of expansion calculations, luminance, motion compensated encoding, and stereo disparity measurement. Such techniques are of special interest in automotive driver assist systems, robotics, and other applications that apply machine vision.

Searching for the best matching patch between two arrays of image data is a needed step in image processing. For example, some stereoscopic imaging systems compute the disparity between left and right images by finding a two-dimensional (2D) patch in the right image that best matches a given 2D patch in the left image. In another example, the alignment of two three-dimensional (3D) point clouds may be accomplished by searching for the best 3D patch matches between the volumes. In another example, video compression algorithms may determine motion between two consecutive images using an optical flow algorithm which matches patches between the two images.

A coarse-to-fine resolution pyramid approach can be used for optical flow algorithm matching. In general, in a pyramid approach, an initial search is performed at a lower resolution than the original images and the initial search result is then refined at one or more higher resolutions. The number of resolution levels in the search pyramid is implementation dependent. The use of a pyramidal search approach is generally faster and more tolerant to local minima as compared to an exhaustive search at high resolution.

Camera-based systems use a variety of computer vision (CV) technologies to implement advanced driver assistance systems (ADAS) that are designed to increase a driver's situational awareness and road safety by providing essential information, warnings, and automatic intervention to reduce the possibility or severity of an accident. Governmental safety regulations and independent rating systems are driving development and wider adoption of ADAS, where camera-based systems are emerging as a key differentiator by original equipment manufacturers (OEMs). Camera-based systems are being widely adopted in ADAS for their reliability, robustness, ability to support various applications, and most importantly flexibility to support more and more ADAS applications in the future. The CV techniques represent a complex, high-performance, and low-power compute problem, especially the low level CV techniques that extract high definition, high density depth (stereo) and motion (optical flow) information from camera images.

SUMMARY

This Summary briefly indicates the nature and substance of this Disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.

Disclosed embodiments include image processors having optical flow logic which implements a quasi-parametric optical flow measurement (QP-OFM) algorithm which combines a pyramidal block matching (PBM) method, which is a non-parametric approach, with a modified Lucas-Kanade (LK) method, which is a parametric approach, to obtain highly precise estimation over a large optical flow (OF) range. The PBM method performs the OF estimation with integer pixel resolution and then at least one pass of the modified LK method is used to refine the PBM obtained OF estimate to obtain a revised optical flow estimate with fractional pixel resolution. One pass of the modified LK method generally provides a good cost benefit balance as it does not need interpolation and a data re-fetch.

Disclosed embodiments include an image processing system that includes a processor and OF determination logic for quantifying relative motion of a feature present in a first frame of video (e.g., a query image) and a second frame of video (e.g., a reference image) that provide at least one of temporally and spatially ordered images with respect to the two frames of video. The OF determination logic configures a processor to implement performing OF estimation between the first frame and second frame using a PBM method to generate an initial OF estimate at a base (lowest) pyramid level having integer pixel resolution, and refining the initial OF estimate using a modified LK method to provide a revised OF estimate having fractional pixel resolution. The QP-OFM algorithm significantly simplifies and improves upon known PBM-based dense OF (DOF) estimation algorithms. The Examples section described below demonstrates improved performance relative to known DOF estimation algorithms.

BRIEF DESCRIPTION OF THE DRAWINGS

Reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, wherein:

FIG. 1 is a block diagram for a system which implements a disclosed QP-OFM algorithm that determines and applies an OF, according to an example embodiment.

FIG. 2 is a flow chart that shows steps in an example method of optical flow measurement using a disclosed QP-OFM, according to an example embodiment.

FIG. 3 is a simplified block diagram of an example disclosed QP-OFM arrangement.

FIG. 4 illustrates a method for computing a 24 bit census descriptor for central pixel x considering its 24 neighbors (a 5×5 neighborhood) and their relationship, according to an example embodiment.

FIG. 5 illustrates 5×5 neighborhood pixels used in census signature computation of a pixel.

FIG. 6 illustrates an example support window.

FIG. 7 shows an example predictor list including optical flow estimates of the 4 neighbors of the top left pixel in the paxel.

FIG. 8 provides an illustration of predictor evaluation involving 5 candidate predictors for the current paxel and the winner predictor among them.

FIG. 9 shows a coarse-to-fine search pattern around the winner predictor.

FIG. 10A depicts an example SoC that provides high programmable compute power by offloading low level, computationally intensive and repetitive CV tasks to a hardware accelerator (HWA).

FIG. 10B depicts implementation details of an example scheme of storing and handling the reference and current picture data in the local memories within a HWA or on a programmable processor involving utilizing a combination of two levels of local memory with one memory for storing picture data from the first frame of video and from the second frame in a growing window fashion, and another memory to store a sliding window of the picture data from the second frame of video.

DETAILED DESCRIPTION

Example embodiments are described with reference to the drawings, wherein like reference numerals are used to designate similar or equivalent elements. Illustrated ordering of acts or events should not be considered as limiting, as some acts or events may occur in different order and/or concurrently with other acts or events. Furthermore, some illustrated acts or events may not be required to implement a methodology in accordance with this disclosure.

Also, the terms “coupled to” or “couples with” (and the like) as used herein without further qualification are intended to describe either an indirect or direct electrical connection. Thus, if a first device “couples” to a second device, that connection can be through a direct electrical connection where there are only parasitics in the pathway, or through an indirect electrical connection via intervening items including other devices and connections. For indirect coupling, the intervening item generally does not modify the information of a signal but may adjust its current level, voltage level, and/or power level.

FIG. 1 shows a block diagram for a system 100 that determines and applies OF in accordance with various disclosed embodiments. The system 100 includes at least one image sensor 106, an image processor 102, and a control system 108. The image processor 102 can be a microprocessor, digital signal processor (DSP), or microcontroller unit (MCU). The image processor 102 includes one or more processors and storage. The image sensor 106 may include a charge coupled device (CCD), a complementary metal oxide semiconductor (CMOS) image sensor, or other photodetector for converting light into electrical signals. The image sensor 106 may include a plurality of photodetectors arranged in a two-dimensional array. The image sensor 106 may periodically capture an image representative of the field of view of the image sensor 106. For example, the image sensor 106 may capture 15, 30, 60, or any suitable number of images per second. The image sensor 106 may be incorporated in a digital video camera. Some disclosed embodiments may include multiple image sensors 106, such as when using a plurality of image sensors 106.

The images captured by the image sensor 106 may be provided to the image processor 102 as one or more arrays of binary values, where each binary value may represent an intensity or color of light detected at a particular photodetector of the image sensor 106 (i.e., a picture element (pixel)). Each image provided to the image processor 102 by the image sensor 106 may be referred to as a frame. The image processor 102 analyzes or manipulates the images 110 received from the image sensor 106 to extract information from the images 110. The image processor 102 includes OF determination logic (optical flow logic) shown in FIG. 1 as “optical flow” 104 that analyzes the images 110 received from the image sensor 106 to measure optical flow of the various elements or features present in the images 110. As noted above, disclosed image processors 102 can be implemented by a HWA.

The optical flow logic 104 applies a disclosed QP-OFM algorithm which, as noted above, combines the PBM method with a modified LK method to obtain accurate and precise OF estimation over a large range (i.e., a variety of different distance values). The PBM method performs the OF estimation with integer pixel resolution and then at least one pass of the modified LK method is used to refine the obtained OF estimate to get a revised OF estimate having fractional pixel resolution. The refined optical flow estimate may be filtered using a post processing filter to filter out potentially noisy estimates. In the below described HWA implementation, only one modified LK pass is generally used. However, in other implementations it is generally best to balance between computation complexity and benefit, as using more iterations may need interpolation and repeated fetching of pixel data from memory.

The PBM method as used herein refers to an OF estimation method which converts each of the frames of video into a hierarchical image pyramid. The image pyramid comprises a plurality of image levels. Image resolution is reduced at each higher one of the image levels. For each image level and for each pixel in the first frame, a processor is configured to establish an initial estimate of a location of the pixel in the second frame to a predefined value or to a candidate position that minimizes a cost function, and to apply a plurality of sequential searches, starting from the initial estimate, that minimize the cost function and establish a refined estimate of the location of the pixel in the second frame.

The LK method as used herein refers to a modified version of a differential OF estimation method developed by Bruce D. Lucas and Takeo Kanade. The LK method assumes that the optical flow is essentially constant in a local neighborhood of the pixel under consideration, and solves the basic optical flow equations for all the pixels in that neighborhood, by the least squares criterion. By combining information from several nearby pixels, the LK method can resolve the inherent ambiguity of the optical flow equation.

The LK method can solve the OF equation defined for a pixel in the image using the gradient information from the neighborhood of that pixel and the least squares principle, using the following relation:

$\begin{bmatrix} U \\ V \end{bmatrix} = \begin{bmatrix} \sum\sum G_x \circ G_x & \sum\sum G_x \circ G_y \\ \sum\sum G_x \circ G_y & \sum\sum G_y \circ G_y \end{bmatrix}^{-1} \times \begin{bmatrix} \sum\sum -G_x \circ G_t \\ \sum\sum -G_y \circ G_t \end{bmatrix}$

where the horizontal and vertical flow components are U and V respectively, Gx, Gy are the horizontal and vertical spatial gradients computed for the first frame, and Gt is the temporal gradient computed between the two images between which the OF is estimated. The conventional LK method computes the OF pixel by pixel in each iteration after performing warping (motion compensating) of the second image. Gx, Gy are defined for all pixels over an n×n (n=3, 5, 7 etc.) neighborhood of a pixel and the ‘∘’ operator defines the element wise multiplication of the 2D vectors. If the 2×2 matrix in the above equation is non-invertible, that is if its determinant is zero, it can be regularized or U and V may be set to zero.
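The least squares relation above can be sketched in a few lines of Python. This is a minimal illustration, not the disclosed implementation: the function name `lk_flow` and the list-of-lists gradient layout are assumptions made for the example, and the singular case simply returns zero flow, which is one of the two options the text mentions.

```python
def lk_flow(gx, gy, gt):
    """Solve the 2x2 LK normal equations for one pixel's n x n neighborhood.

    gx, gy, gt are equally sized 2D lists holding the spatial and temporal
    gradients (hypothetical layout chosen for this sketch).
    Returns (U, V); falls back to (0.0, 0.0) when the matrix is singular.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for row_x, row_y, row_t in zip(gx, gy, gt):
        for x, y, t in zip(row_x, row_y, row_t):
            sxx += x * x   # sum of Gx∘Gx
            sxy += x * y   # sum of Gx∘Gy
            syy += y * y   # sum of Gy∘Gy
            sxt += x * t   # sum of Gx∘Gt
            syt += y * t   # sum of Gy∘Gt
    det = sxx * syy - sxy * sxy
    if det == 0:
        return 0.0, 0.0
    # Inverse via the adjugate of [[sxx, sxy], [sxy, syy]] applied to [-sxt, -syt].
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

For gradients that satisfy Gx·U + Gy·V + Gt = 0 exactly over the window, the function recovers that (U, V).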

Regarding the modified LK method used by disclosed QP-OFM algorithms, the inputs are the same Gx, Gy and Gt as described above but are computed during a step search process (described below) to obtain horizontal and vertical flow components U and V respectively. The U and V flow components, being real numbers, are represented in fixed point representation, such as with 4 bits assigned for storing the fractional part. These flow values represent the incremental refinement applied to an existing OF estimate F₁^(int) (obtained by the PBM method at integer pixel resolution), referred to as “an initial OF estimate” in method 200 described below, to achieve a “revised OF estimate” in method 200 having higher accuracy with fractional pixel resolution. The values of U and V can be clamped within the [−1, 1] range before they are added into the existing flow estimates to obtain the final flow output F₁. This operation can be represented in equation form as:

$F_1 = F_1^{int} + \mathrm{clamp}((U, V), -1, 1)$

In one implementation, the U and V computation followed by the clamp operation in fixed point (including the division operation involved in the inverse matrix computation using an adjugate of the matrix) can be implemented using only simple integer arithmetic operations of multiplication, addition and comparison as shown below.

$U = \begin{cases} 0 & \text{if } D = 0 \text{ or } |D| > 16 \times |N_U| \\ 1 & \text{if } |N_U| \geq |D| \\ \underset{fF}{\operatorname{argmin}} \left| |N_U| - fF \times |D| \right| & \text{otherwise} \end{cases}$

$\mathrm{Sign}_U = \begin{cases} - & \text{if } \operatorname{sign}(N_U) \neq \operatorname{sign}(D) \\ + & \text{otherwise} \end{cases}$

$V = \begin{cases} 0 & \text{if } D = 0 \text{ or } |D| > 16 \times |N_V| \\ 1 & \text{if } |N_V| \geq |D| \\ \underset{fF}{\operatorname{argmin}} \left| |N_V| - fF \times |D| \right| & \text{otherwise} \end{cases}$

$\mathrm{Sign}_V = \begin{cases} - & \text{if } \operatorname{sign}(N_V) \neq \operatorname{sign}(D) \\ + & \text{otherwise} \end{cases}$

where

$D = \sum\sum G_x \circ G_x \times \sum\sum G_y \circ G_y - \left( \sum\sum G_x \circ G_y \right)^2$

$N_U = -\sum\sum G_y \circ G_y \times \sum\sum G_x \circ G_t + \sum\sum G_x \circ G_y \times \sum\sum G_y \circ G_t$

$N_V = \sum\sum G_x \circ G_y \times \sum\sum G_x \circ G_t - \sum\sum G_x \circ G_x \times \sum\sum G_y \circ G_t$

and

$fF \in \{0.9375, 0.8750, 0.8125, 0.7500, 0.6875, 0.6250, 0.5625, 0.5000, 0.4375, 0.3750, 0.3125, 0.2500, 0.1875, 0.1250, 0.0625\}$

The case expressions give the magnitudes of U and V, and Sign_U and Sign_V give their signs. All possible values of U and V can be represented in an appropriate fixed point representation with 4 bits of fractional part.
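A minimal Python sketch of this integer-only evaluation follows, returning U and V in units of 1/16 (4 fractional bits). The helper names `fixed_point_ratio` and `lk_increment_q4` are hypothetical, the argmin is done by a plain search over the 15 fractional steps, and ties resolve to the smaller magnitude, which is an assumption. N_V is taken here from the second row of the adjugate-based inverse of the 2×2 matrix in the LK relation, which is also an assumption about the intended sign.

```python
FRAC_BITS = 4
ONE = 1 << FRAC_BITS  # fixed-point value of 1.0, i.e. 16 steps of 1/16


def fixed_point_ratio(num, den):
    """Approximate clamp(num / den, -1, 1) in units of 1/16.

    Uses only integer multiplication, addition/subtraction and comparison.
    """
    a_num, a_den = abs(num), abs(den)
    if den == 0 or a_den > ONE * a_num:
        return 0
    if a_num >= a_den:
        mag = ONE
    else:
        # Pick k in 1..15 minimizing | |num| - (k/16) * |den| |, scaled by 16
        # so that no division is needed.
        mag = min(range(1, ONE), key=lambda k: abs(ONE * a_num - k * a_den))
    negative = (num < 0) != (den < 0)
    return -mag if negative else mag


def lk_increment_q4(sxx, sxy, syy, sxt, syt):
    """Return (U, V) in 1/16 pixel units from the five integer gradient sums."""
    d = sxx * syy - sxy * sxy
    n_u = -syy * sxt + sxy * syt
    n_v = sxy * sxt - sxx * syt
    return fixed_point_ratio(n_u, d), fixed_point_ratio(n_v, d)
```

For example, gradient sums that correspond to a true increment of (0.5, −0.25) pixels yield (8, −4) in these units.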

The disclosed modified LK method removes the above-described image warping used in the known LK method. The image warping step is removed by using one pass, assuming that the OF is essentially constant in a local neighborhood and equal to that of the pixel under consideration. Significantly, this allows the pixel data fetched during the step search process to be reused for Gt computation, which is especially useful as random accesses to the memory can be avoided. As described above, the modified LK method also splits the compute into sub-tasks of spatial gradient computation and the remaining operations.

The revised OF measurements may be filtered using a post processing filter (see post processing filter 306 in FIG. 3 described below) to obtain the final output OF estimates 112. The OF estimates 112 generated by the image processor 102 may be provided to the control system 108. The control system 108 may apply the revised OF measurements 112 to control the motion of the system 100, to present motion information to a user of the system 100, etc. For example, if the system 100 is an automotive driver assist system (ADAS), then the control system 108 may apply the revised OF measurements 112 to determine whether a vehicle should change speed and/or direction based on the relative motion of the vehicle and objects detected by the image processor 102. In some ADAS implementations, the control system 108 may autonomously change vehicle speed and direction based, at least in part, on the revised OF measurements 112, while in other embodiments the control system 108 may, based on the revised OF measurements 112, provide alerts to an operator of the vehicle indicating that changes in speed and/or direction may be advisable. Similarly, in robotics and other motion control applications, the control system 108 may control movement (speed and/or direction) of an element of the system 100 based on the revised optical flow measurements 112.

FIG. 2 is a flow chart that shows steps in an example method 200 of OF measurement using a disclosed QP-OFM, according to an example embodiment. In this example the modified LK method is applied only at the base level of the pyramid. Step 201 comprises acquiring a first frame of video (e.g., a reference image) and a second frame of video (e.g., a query image) that provide at least one of temporally and spatially ordered images. Optionally, stored historic flow estimates or auxiliary estimates derived from a parametric model of the image velocities may also be provided as prior evidence of the image velocities to be used as predictors during the PBM process. In typical operation the QP-OFM algorithm estimates motion as instantaneous image velocities (pixel motion) when the inputs given are two temporally ordered images, optionally along with prior/auxiliary flow estimates which provide evidence of the image velocities.

Step 202 comprises using a processor implementing a stored QP-OFM algorithm which performs steps 203 and 204. Step 203 comprises optical flow estimating between the first frame and second frame using the PBM method to generate an initial obtained flow estimate having integer pixel resolution. The PBM method starts the optical flow estimation at the highest pyramid level using spatial predictors and a step search method (described below) based block matching (BM) process that minimizes a cost function value to obtain accurate motion estimation. These optical flow estimates are then filtered (e.g., using a 2D 5×5 median filter), appropriately scaled up and then refined (again using spatial predictors and the step search method) further at each lower pyramid level sequentially until the base pyramid level. No filtering or scaling of the OF estimates after the BM process at the base layer is needed. In one embodiment the suitable cost function is a Hamming distance over the binary feature descriptors.
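A Hamming distance cost over binary descriptors can be sketched as follows. This is an illustrative sketch only: descriptors are modeled as Python integers, and summing the per-pixel distances over a list of descriptors (standing in for a support window) is an assumption of this example, as are the names `hamming_distance` and `block_cost`.

```python
def hamming_distance(a, b):
    """Number of differing bits between two binary descriptors (as ints)."""
    return bin(a ^ b).count("1")


def block_cost(query_descriptors, reference_descriptors):
    """Sum of per-pixel Hamming distances over two equally sized descriptor lists.

    The lists stand in for the descriptors inside a support window
    (hypothetical layout for this sketch).
    """
    return sum(
        hamming_distance(q, r)
        for q, r in zip(query_descriptors, reference_descriptors)
    )
```

A lower cost indicates a better match between the query block and the candidate block in the reference image.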

At the base pyramid level the predictor configuration can be altered to include temporal predictors or auxiliary predictors, but the step search can remain the same. At the base pyramid level, once the PBM process completes, a pass of the modified LK step (step 204 described below) is performed to refine the obtained flow estimates to obtain a revised optical flow estimate having fractional pixel resolution, enabling the determination of precise pixel motion. According to another embodiment a pass of the modified LK method may also be applied while processing higher pyramid levels after the BM process.

In order to reduce computations and data bandwidth, instead of BM processing each pixel, at each pyramid level the BM process can work on a 2×2 paxel (a 2×2 block of neighboring pixels) granularity where different computational blocks such as the predictor evaluation and step search can work on 4 pixels simultaneously. This method can also be applied for all possible paxel configurations including 1×1 (=a pixel), 3×3, 2×4 etc., although experiments have found a 2×2 paxel to be best for desired quality and HWA efficiency. In such a scheme the predictors are evaluated only for one representative pixel in the paxel to decide the best estimate (winner predictor) of the OF, which is then independently refined for all pixels in the group during the step search process.

The flow post processing involves use of a 2D median filter (see flow post processing filter 306 in FIG. 3 described below), such as having a size 5×5 (height×width). The post processing block takes as input the OF estimates (Fi) and generates the filtered OF output (F). Alternatively a pseudo 1D separable median filter, such as of a size 9×9, can also be used. The 9×9 filter can be separated into 3×9 and 9×3 filters applied sequentially in that order (or the order can be reversed). The height and width of the filter are configured so that the number of flow samples (27) used in each median computation is small yet provides high quality post processing. In this case, to filter the boundary pixels, border extension of the flow field on all sides by 4 pixels is performed. The confidence assignment block (see confidence assignment block 307 in FIG. 3 described below) can use the cost function values computed during the PBM step and local variability of the obtained flow estimates in a learning based framework to assign a confidence level for each pixel.
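The pseudo 1D separable median filtering described above (a 3×9 pass followed by a 9×3 pass, each taking the median of 27 samples) can be sketched as below for one flow component. The sketch assumes the border extension replicates the nearest edge value, which the text does not specify; clamped indexing reaches at most 4 samples outside the field. The function names are hypothetical.

```python
from statistics import median_low


def median_pass(field, half_h, half_w):
    """Median filter with a (2*half_h+1) x (2*half_w+1) window (height x width).

    Out-of-range samples are taken from the nearest border value
    (replicate extension, an assumption of this sketch).
    """
    h, w = len(field), len(field[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                field[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-half_h, half_h + 1)
                for dx in range(-half_w, half_w + 1)
            ]
            # The window has an odd sample count (27), so median_low is the median.
            out[y][x] = median_low(window)
    return out


def pseudo_separable_median_9x9(field):
    """Apply a 3x9 median pass followed by a 9x3 median pass."""
    return median_pass(median_pass(field, 1, 4), 4, 1)
```

Applied separately to the horizontal and vertical flow components, a single impulsive outlier in an otherwise constant field is removed by the first pass.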

Step 204 comprises refining the initial PBM obtained flow estimate using at least one pass of the modified LK method to provide a revised OF estimate having fractional pixel resolution. As described above the modified LK method modifies the known LK algorithm to remove the conventional image warping step and splits the compute steps into two groups of gradient computation (spatial and temporal) operations and the remaining operations. This LK step thus reuses data fetched by the PBM method to perform fractional pixel refinement. The LK step can use reduced precision gradient data for the OF refinement. The LK step can also use only integer arithmetic to perform compute tasks including matrix inversion.

As described above, the method can optionally include post processing filtering (see flow post processing filter 306 in FIG. 3 described below which filters flow vectors output by the QP-OFM algorithm block 304). For example, pseudo-1D separable median filters can be used to remove impulsive noise from the estimated flow vectors.

An example QP-OFM arrangement 300 is presented as a simplified block diagram shown in FIG. 3 including inputs 310 to an OF calculation block 302 which includes a QP-OFM algorithm block 304, where the OF calculation block 302 provides outputs 312, according to an example embodiment.

The inputs 310 are generally two grayscale images. One can also use any one of the chroma channel information represented as one binary value per pixel. A query image is shown as image 1 and a reference image is shown as image 2, between which the OF is to be measured/estimated by the QP-OFM scheme arrangement. Prior/auxiliary optical flow estimates are shown provided as other inputs. Such prior estimates temporally precede the current optical flow estimate and are generally provided at the same resolution as that of the input images (image 1 and image 2), and have an expected density of 100%, that is, they are estimated for each pixel or paxel.

The Image Pyramid Generation Block (shown as pyramid generation) 304 a receives inputs 310 including grayscale images (I) comprising image 2 and image 1, a programmed number of pyramid levels N, and a sigma value (Sg) (e.g., from a processor) to be used to derive a Gaussian filter kernel. Image Pyramid Generation Block 304 a outputs an image pyramid, an N-level multi-resolution representation of the image such that every n^(th) image in the pyramid is half the resolution in both horizontal and vertical dimensions of the image at the (n−1)^(th) level. Image Pyramid Generation Block 304 a generally provides preprocessing. If the input image has dimensions W×H and the image pyramid will have N levels, the image can be padded on the bottom and right, such that the resultant image dimensions W′×H′ are multiples of 2^((N-1)), or of 2^((N)) in some cases. The padded pixels can have value 0. Let the padded image be called I′.
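As a worked example of the padding rule, the following Python sketch computes W′×H′ as the next multiples of 2^(N−1) and zero-pads an image on the bottom and right. The function names and the list-of-lists image layout are assumptions made for illustration.

```python
def padded_size(width, height, num_levels):
    """Round width and height up to the next multiple of 2**(num_levels - 1)."""
    step = 1 << (num_levels - 1)
    padded_w = -(-width // step) * step    # ceiling division, then scale back
    padded_h = -(-height // step) * step
    return padded_w, padded_h


def pad_image(image, num_levels):
    """Zero-pad a 2D list image on the right and bottom to the padded size."""
    height, width = len(image), len(image[0])
    padded_w, padded_h = padded_size(width, height, num_levels)
    rows = [row + [0] * (padded_w - width) for row in image]
    rows += [[0] * padded_w for _ in range(padded_h - height)]
    return rows
```

With N = 5 levels, for instance, a 1242×375 input would be padded to 1248×384, so that every level down to the top of the pyramid has integer dimensions.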

An example process for Image Pyramid Generation Block 304 a is described below:

  Let P_(i) represent the image in the i^(th) pyramid level.
  Initialization: P₁ = I', derive 2D Gaussian filter kernel G of size 5×5
  FOR (i = 2; i<=N; i++) % i represents the level in image pyramid
    Filter P_((i-1)) with the Gaussian filter kernel G to obtain filtered image P'_((i-1))
    Obtain a scaled down image P_(i) by choosing alternate pixels of P'_((i-1)) in both the directions
  END
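The pyramid generation loop above can be sketched in Python as follows. Two details are assumptions of this sketch rather than statements of the disclosure: the 5×5 kernel is built as a normalized, sampled Gaussian from the sigma value, and border pixels are handled by replicating the nearest edge value during filtering. All function names are hypothetical.

```python
import math


def gaussian_kernel_5x5(sigma):
    """Normalized 5x5 Gaussian kernel built from a sampled 1D Gaussian."""
    taps = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-2, 3)]
    total = sum(taps)
    taps = [t / total for t in taps]
    return [[a * b for b in taps] for a in taps]


def filter_image(image, kernel):
    """Filter a 2D list image with a 5x5 kernel (replicated borders)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-2, 3):
                yy = min(max(y + dy, 0), h - 1)
                for dx in range(-2, 3):
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 2][dx + 2] * image[yy][xx]
            out[y][x] = acc
    return out


def build_pyramid(image, num_levels, sigma):
    """Return [P1, P2, ..., PN], where P1 is the (already padded) input image."""
    kernel = gaussian_kernel_5x5(sigma)
    pyramid = [image]
    for _ in range(2, num_levels + 1):
        smooth = filter_image(pyramid[-1], kernel)
        # Keep alternate pixels in both directions to halve the resolution.
        pyramid.append([row[::2] for row in smooth[::2]])
    return pyramid
```

Because the input is assumed to be padded beforehand, each level has exactly half the width and height of the level below it.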

Besides pyramid generation block 304 a, QP-OFM algorithm block 304 also includes a block matching optical flow estimation block 304 b (implementing the block matching, filtration and scaling functionality involved in the PBM process (step 203)) and a differential techniques-based local method of optical flow estimation block 304 c (for implementing the modified LK (step 204)). Blocks 304 b and 304 c process one pyramid level at a time starting from the top of the pyramid. Block 304 c is enabled only for the base of the pyramid in one embodiment, and at all levels in another embodiment. Block matching optical flow estimation block 304 b provides an OF estimate with integer pixel precision. Optical flow estimation block 304 c provides a revised OF estimate with fractional pixel resolution. The filtration and scaling functionalities of Block 304 b can be disabled at the base of the pyramid for the embodiment where Block 304 c is enabled only for the base of the pyramid. In the embodiment where Block 304 c is enabled at all levels, the filtration and scaling functionalities of Block 304 b are disabled for the base of the pyramid, and for the higher levels these functionalities are used after revision (by Block 304 c) of the OF estimates (obtained by Block 304 b).

As described above, inputs to the pyramid generation block 304 a include two grayscale images (query image and reference image) for which optical flow is to be estimated and historic image velocity estimates or temporal predictors (all of dimensions W′×H′ pixels). The temporal predictors are an optional input and their multiscale representation is not obtained by the pyramid generation block 304 a; this input is shown with the dotted path in FIG. 3.

QP-OFM algorithm block 304 outputs a flow estimate comprising appropriately filtered and scaled flow vectors for the next pyramid level (F_(i-1)) or refined flow estimates (F_(i)) if i==1 (i.e., base resolution). The QP-OFM algorithm block 304 output for levels higher than the base level is consumed within QP-OFM algorithm block 304, and at the base level is passed on to the flow post processing block 306 and to the confidence assignment block 307. As disclosed above, for levels other than the base level, the flow can be filtered with a simple 2D median filter, with the flow estimate resolution increased to match that of the lower level and the flow estimates scaled with an appropriate scale factor. This may be followed by a rounding operation when a pass of modified LK is applied at all pyramid levels. For the base level the QP-OFM algorithm block 304 output is passed on to flow post processing block 306, such as for post processing by pseudo-1D separable median filters, and to confidence assignment block 307 for confidence assignment.
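The resolution increase, scaling and rounding of the flow estimates between pyramid levels can be illustrated with the short Python sketch below. It assumes nearest-neighbor replication for the resolution increase, a scale factor of 2 between levels, and round-half-up for the rounding step; none of these specifics is stated in the text, and the function name `upscale_flow` is hypothetical. The preceding 2D median filtering is omitted here.

```python
import math


def upscale_flow(flow, scale=2):
    """Propagate a flow field from pyramid level i to level i-1.

    flow is a 2D list of (u, v) tuples. Each vector is scaled, rounded to the
    nearest integer (ties rounded up, an assumption), and replicated into a
    scale x scale block of the finer level.
    """
    out = []
    for row in flow:
        new_row = []
        for u, v in row:
            scaled = (math.floor(u * scale + 0.5), math.floor(v * scale + 0.5))
            new_row.extend([scaled] * scale)
        for _ in range(scale):
            out.append(list(new_row))
    return out
```

The rounding only matters when the modified LK pass has produced fractional values at the coarser level; integer PBM estimates scale to integers directly.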

Pre-computation processing can also be provided prior to the QP-OFM algorithm, such as padding (toward the bottom and right) so that all levels have integer dimensions and exactly the same (e.g., 0.5) scale factor. Initialization can comprise setting the flow estimates (F_(N)) to 0 for level N.

An example process provided by QP-OFM algorithm blocks 304 b and 304 c for a pyramid level is described below. The process described is for a pyramid level except the binary census transform step, but the algorithm block 304 b may also provide the functionality of preparing binary pixel descriptors by binary census transform for the query image and reference image.

  FOR (All 2×2 paxels in the query image) % Processed in raster scan order
    - Prepare a list of predictors for the top left pixel in the paxel
    - Compute a median of the horizontal and vertical components of the predictors. Let the resultant flow vector be called the median predictor.
    - Find the best predictor (winner predictor) that minimizes the Cost function value in the reference image
    - Set the winner predictor as the current optical flow estimate for all 4 pixels in the current 2×2 paxel
    FOR (All pixels in the current 2×2 paxel in the query image)
      - 3-1 Search: perform in-plane coarse-to-fine (skip pel-3 + skip pel-1) block matching search around the current optical flow estimate, minimizing the Cost function value.
        - During the skip pel-1 search, if the LK step is enabled for the current level: compute horizontal (Gx) and vertical (Gy) image gradient data for the query image along with the temporal (Gt) gradient using the query image and the motion compensated reference image. If these computations are performed during the skip pel-1 search, the LK step does not require any data fetch from memory for motion compensation.
        - Note: Gradients are computed using gray level pixel information and not census transform data.
      - The modified LK step (if the LK step is enabled for the current level)
        - Use the image gradient data Gx, Gy and Gt to solve for the change in the optical flow estimates
        - Clamp the change within the [−1, 1] range in both horizontal and vertical directions
        - Update the optical flow estimates obtained during the step search process by adding the flow values computed in this step.
    END
  END
  IF (i ≠ 1) % i.e., not the base level
    Obtain flow estimates for next pyramid level
    Input: Flow estimates for current level (F_(i))
    Output: Flow estimate for next pyramid level (F_(i-1))
    Process:
      - Perform 2D median filtering of the updated flow estimates F_(i)
      - Upscale the flow estimate resolution
      - Scale the flow values using appropriate scale factor and round off to nearest integer value
  ELSE
    Input: Flow estimates at fractional pixel precision (F₁)
    Output: Post processed optical flow (F)
    Process:
      - Perform post filtering of the obtained flow estimates F₁ to get output flow F
      - Crop F to original image dimensions
  END

As noted above, OF calculation block 302 is also shown having a flow post processing filter 306 that receives unfiltered flow vector data from QP-OFM algorithm block 304 and outputs filtered flow vector data. OF calculation block 302 is also shown having a confidence assignment block 307. Confidence assignment block 307 receives Hamming distance (HD) costs generated during optical flow estimation processing and unfiltered flow vectors from QP-OFM algorithm block 304, and generates confidence data.

Flow post processing filter 306 is shown outputting filtered flow vector data that has horizontal and vertical flow components, which provides a dense OF estimate output 312 b. The confidence data, a confidence level assigned to each pixel by confidence assignment block 307, is shown provided to the confidence map 312 a. The dense optical flow estimates 312 b are generally provided at the same resolution as the input images (image 1 and image 2), with an expected density of 100% and a bit resolution per pixel per flow component of 16 bit fixed point (7 and 8 integer bits for vertical and horizontal flow components respectively and 4 fractional bits for both of them).

Computational details and illustrations for various QP-OFM algorithm steps are now described. Regarding Gaussian filtering processing in pyramid generation block 304 a, a Gaussian filter of size 5×5 can be used to filter pyramid level images such that the immediately upper pyramid level image can be generated by dropping alternate pixels from the filtered image. A Gaussian kernel derivation is provided below.

In 2-D (x,y), an isotropic Gaussian kernel of size 5×5 has the following form:

$G(x,y) = \frac{1}{2\pi\,{Sg}^{2}}\, e^{-\frac{(x-2)^{2} + (y-2)^{2}}{2\,{Sg}^{2}}}$

where Sg is the distance sigma value used to derive the Gaussian filter kernel, and (x,y) represents a row and column location of a filter coefficient in the Gaussian kernel. In a 5×5 filter kernel, possible row and column positions are in the range [0,4]. This scheme uses an Sg value of 1.61. The filter coefficients thus obtained are real numbers in fixed point representation.

Regarding the filtering process (e.g., implemented by a separate FIR filter), at any pyramid level ‘n’ the Gaussian filtered image P′_(n) is obtained by convolving the image P_(n) with a Gaussian mask G as defined above:

$P'_{n} = G * P_{n}$

During the convolution process the boundary pixels are filtered by performing a border extension of 2 pixels on each side.
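As an illustration of the kernel derivation above, the following minimal Python sketch evaluates G(x,y) on the 5×5 grid with Sg = 1.61. The function names and the normalization step (scaling the coefficients so they sum to one) are assumptions of this sketch; the conversion to a fixed point representation is not shown because its word length is not stated here.

```python
import math

def gaussian_kernel_5x5(sg=1.61):
    """Evaluate G(x, y) for x, y in [0, 4]; the kernel is centred on (2, 2)."""
    norm = 1.0 / (2.0 * math.pi * sg * sg)
    return [[norm * math.exp(-((x - 2) ** 2 + (y - 2) ** 2) / (2.0 * sg * sg))
             for y in range(5)]
            for x in range(5)]

def normalize(kernel):
    """Assumed step: scale coefficients to sum to 1 so filtering keeps mean brightness."""
    total = sum(sum(row) for row in kernel)
    return [[coeff / total for coeff in row] for row in kernel]

kernel = normalize(gaussian_kernel_5x5())
```

The largest coefficient sits at the center position (2,2), and the kernel is symmetric about that position in both directions.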

The binary features-based coarse-to-fine block matching optical flow estimation block 304 b can also perform a binary census transform. A method for computing a 24 bit census descriptor for central pixel x considering its 24 neighbors (a 5×5 neighborhood) and their relationship with it is illustrated in FIG. 4. A bit corresponding to a neighborhood pixel is set to 1 if it has a grayscale value greater than or equal to that of the pixel x, otherwise the bit is set to 0. FIG. 5 illustrates the 5×5 neighborhood pixels used in census signature computation of a pixel. While computing the census transform for the border pixels, a border extension technique is used to replace missing pixels. The square box in the middle of the shaded region is the central pixel for which the census transform is being computed.
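The census rule described above can be sketched in Python as follows. The function name, the bit ordering (raster scan, most significant bit first) and the use of edge replication as the border extension are assumptions of this sketch, since the ordering used in FIG. 4 is not reproduced here.

```python
def census_24(image, row, col):
    """24-bit census descriptor of the pixel at (row, col) over its 5x5 neighborhood.

    Assumptions: bits are packed in raster scan order, MSB first, and missing
    border pixels are replaced by replicating the nearest edge pixel.
    """
    height, width = len(image), len(image[0])
    center = image[row][col]
    descriptor = 0
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            if dr == 0 and dc == 0:
                continue  # the central pixel itself contributes no bit
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            bit = 1 if image[r][c] >= center else 0
            descriptor = (descriptor << 1) | bit
    return descriptor
```

For a flat image every neighbor is greater than or equal to the center, so all 24 bits are set; for a pixel that is strictly brighter than all of its neighbors the descriptor is zero.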

Regarding a cost function, as described above, the PBM method uses a binary census transform for pixel description and a Hamming distance as the Cost function during the predictor evaluation and step search. While evaluating the cost function value at a predicted/search position, a square neighborhood (support window) of 9×9 around the current pixel (see FIG. 6 described below for an illustration of a support window) in the query image and a same sized neighborhood around the predicted/search position in the reference image are considered. These neighborhoods act as query data and reference data, which are compared bitwise to find discordance. In order to do this, the binary signatures of the 81 pixels in the query and reference data are concatenated into two binary vectors. If Qv and Rv represent these vectors, then the binary Hamming distance between them is computed as:

$HD = \mathrm{bitcount}(Qv \oplus Rv)$

where ⊕ defines the bitwise exclusive-or operation between two binary vectors. The bitcount operation calculates the number of bits set to one after the bitwise exclusive-or operation, and the output HD is the Hamming distance between the two data.

Predictor list preparation may also be used. Predictors are the existing OF estimates for spatial and temporal neighbors that can be used as initial OF estimates for the block matching search based OF estimation process for a pixel. The OF estimation algorithm may use 5 different predictors per paxel. These include OF estimates of the 4 neighbors of the top left pixel in the paxel, which are labeled in FIG. 7. These are the pixels that precede the top-left pixel when traversing the image in raster scan order from left to right and top to bottom.

Accounting for the pipelined design of the HWA, the left predictor is expected to come from the left neighbor of the top-left pixel in the paxel, where the OF estimate for the pixel derived in the higher pyramid level and appropriately scaled for the current pyramid level is the predictor value. This predictor is called the pyramidal-left predictor. In alternate designs the left predictor value may come from a pixel whose OF estimate has been refined in the current pyramid level and lies to the left of the top-left pixel at some pixel distance. This predictor can be called the delayed-left predictor. The fifth predictor is the existing OF estimate for the current pixel, which is derived in the higher pyramid level and appropriately scaled for the current pyramid level. At the base pyramid level the pyramidal-left/delayed-left predictor is disabled and replaced with the temporal predictor when it is enabled. Predictor usage is summarized in Table 1 below, and the relative pixel locations (with respect to the top-left position in the paxel) providing top and left spatial predictors are depicted in FIG. 7.

TABLE 1
Summary of example predictor usage according to the pyramid level

Pyramid level                 | User configuration          | Predictors used
Base pyramid level            | Temporal predictor enabled  | Top-right, top, top-left, pyramidal-co-located and temporal predictor
Base pyramid level            | Temporal predictor disabled | Top-right, top, top-left, pyramidal-co-located, pyramidal-left/delayed-left
Other than base pyramid level | Not applicable              | Top-right, top, top-left, pyramidal-left/delayed-left and pyramidal-co-located predictor

It is noted that temporal predictors can be in fractional-pel resolution, and thus are rounded off to the nearest integer location. Other spatial predictor values are obtained during the step search for refinement of OF estimates for the corresponding pixels or from the higher pyramid level estimate. They are commonly of integer pixel resolution, but when the LK step is used at each pyramid level they can be in fractional pixel precision and should be rounded off to the nearest integer. The top predictors are generally not available for the top row of the picture, the top-left and left predictors are not available for left border pixels, and the top-right predictor is not available on right boundaries. In this scenario unavailable predictors should generally be replaced with the pyramidal-co-located predictor or set to the (0,0) value. When the pyramidal-co-located predictor is used for this purpose at the highest pyramid level, it takes the value (0,0).

A computation can be made for the median predictor. The median predictor to be used in the search process is computed by independently finding the median of the horizontal and vertical components of the selected predictors.
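A minimal Python sketch of this component-wise median follows. The function name and the representation of a predictor as a (horizontal, vertical) tuple are assumptions of the sketch.

```python
from statistics import median_low

def median_predictor(predictors):
    """Component-wise median of a list of (horizontal, vertical) flow predictors."""
    horizontal = median_low([p[0] for p in predictors])
    vertical = median_low([p[1] for p in predictors])
    return (horizontal, vertical)

# Five hypothetical predictors for one paxel.
example = [(4, -1), (3, 0), (10, 2), (3, 1), (5, 0)]
result = median_predictor(example)  # (4, 0)
```

Because the two components are treated independently, the median predictor need not coincide with any single input predictor, as the example shows.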

Regarding selection of the best predictor, considering an example 9×9 support window around the top left pixel in the paxel, the cost function value is estimated at all the predicted positions in the reference image, and the predictor that leads to the minimum Cost function value is selected as the winner predictor. In case the Cost function values are the same for more than one predicted position, the first predictor in the evaluation order that leads to the minimum value is chosen as the winner predictor. An example evaluation order is 1) pyramidal-current, 2) pyramidal-left/delayed-left/temporal, 3) top-right, 4) top and 5) top-left. The predictor evaluation for the top-right, top and top-left predictors can be merged with the step search for the corresponding pixel locations in the top row to minimize the data read from memory.
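The tie-breaking rule described above can be sketched as follows. The function name and the `cost_fn` callback (which would return the Cost function value at a predicted position) are assumptions of this sketch.

```python
def select_winner(predictors, cost_fn):
    """Return (winner, cost); ties go to the earliest predictor in evaluation order.

    `predictors` is assumed to be listed in evaluation order already.
    """
    winner, best_cost = None, None
    for candidate in predictors:
        cost = cost_fn(candidate)
        # Strict '<' keeps the first predictor that reached the minimum.
        if best_cost is None or cost < best_cost:
            winner, best_cost = candidate, cost
    return winner, best_cost
```

Using a strict comparison is what preserves the first-in-order winner when two predicted positions yield the same cost.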

FIG. 8 provides an illustration of predictor evaluation involving 5 candidate predictors for the current paxel and the winner predictor among them. The winner predictor is now set as the current OF estimate for all pixels in the paxel and is further refined by the step search process explained below.

During the step search process an in-plane coarse-to-fine search can be performed to refine the current OF estimates such that the Cost function value is minimized. The cost function used during step search is defined as HD+MVCost, where MVCost is defined as the product of the motion smoothness factor (λ=24 as a sample value) and the vector distance (city block, or sum of absolute differences of horizontal and vertical components) between the search point and the median predictor. The search is performed in a specific pattern such that computational complexity is minimized while providing a wider refinement range.
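The step search cost HD+MVCost can be sketched in Python as follows. The sketch assumes that the concatenated binary vectors Qv and Rv are held as Python integers; the function names are illustrative only.

```python
def hamming_distance(query_bits, reference_bits):
    """bitcount(Qv XOR Rv) for two binary vectors packed into Python integers."""
    return bin(query_bits ^ reference_bits).count("1")

def mv_cost(candidate, median_pred, smoothness=24):
    """Motion smoothness factor times the city-block distance to the median predictor."""
    distance = abs(candidate[0] - median_pred[0]) + abs(candidate[1] - median_pred[1])
    return smoothness * distance

def step_search_cost(query_bits, reference_bits, candidate, median_pred, smoothness=24):
    """Cost used during step search: HD + MVCost."""
    return hamming_distance(query_bits, reference_bits) + mv_cost(
        candidate, median_pred, smoothness)
```

With λ=24, moving one pixel further from the median predictor is only worthwhile if it removes more than 24 mismatching descriptor bits, which is how the smoothness term discourages isolated outlier vectors.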

Stage-1 (Skip-3)

In this stage a skip-3 search over 9 locations (3×3 grid) centered on the winner predictor is performed. The search points are at a 3 pixel distance from the current optical flow estimate or winner predictor position. In FIG. 8, which illustrates the 3-1 step search process and the coarse-to-fine search paths, these 9 locations are depicted by the dotted pattern pixels for the top left pixel in the paxel. The nine candidate positions are evaluated in raster scan order, starting from the top left position in the grid. In case the Cost function values are the same for more than one search position, the first search position in the evaluation order that leads to the minimum value is chosen as the winner position and the OF is estimated accordingly.

Stage-2 (Skip-1)

In this stage the OF estimate obtained in the previous stage is refined by searching over 9 points in a 3×3 neighborhood (Skip-1) marked by the random fill pattern. The winner of Stage-2 gives the best integer pixel motion estimate around the winner predictor. In case the Cost function values are the same for more than one search position, the first search position in the evaluation order (raster scan) that leads to the minimum value is chosen as the winner position and the OF estimate is updated accordingly.
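The two stages above can be sketched together as follows. The function name, the `cost_fn` callback and the convention that a negative vertical offset means "up" are assumptions of this sketch; search range and picture boundary checks are omitted.

```python
def step_search_3_1(start, cost_fn):
    """Two-stage in-plane search: a skip-3 pass followed by a skip-1 pass.

    `start` is the winner predictor as (horizontal, vertical); `cost_fn`
    returns the Cost function value for a candidate flow vector.
    """
    best = start
    for skip in (3, 1):
        center = best
        best_cost = None
        # Raster scan order: rows top to bottom, then columns left to right.
        for dy in (-skip, 0, skip):
            for dx in (-skip, 0, skip):
                candidate = (center[0] + dx, center[1] + dy)
                cost = cost_fn(candidate)
                if best_cost is None or cost < best_cost:  # first minimum wins ties
                    best, best_cost = candidate, cost
    return best
```

Each stage evaluates 9 positions, so the refinement reaches up to ±4 pixels around the winner predictor with 18 cost evaluations rather than the 81 needed for an exhaustive search over the same range.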

Regarding the search range restriction, at any pyramid level when the candidate position (during predictor evaluation or step search) at which the Cost function value is to be evaluated is at a distance larger than a threshold value called the search range in the horizontal (191 pixels) or vertical (63 pixels) direction, then the corresponding search position can be ignored. If all candidate positions during the predictor search step are at a distance larger than the search range, then the current optical flow estimate can be set to (0,0). This can be achieved by setting the pyramidal co-located predictor value to (0,0).
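A minimal sketch of this restriction is given below. It assumes that the distance is measured per component from the current pixel, so that it equals the magnitude of the candidate flow component; the function names are illustrative.

```python
def within_search_range(flow, range_h=191, range_v=63):
    """True if a candidate flow vector lies inside the horizontal and vertical search range."""
    return abs(flow[0]) <= range_h and abs(flow[1]) <= range_v

def valid_candidates(candidates, range_h=191, range_v=63):
    """Drop out-of-range candidates; fall back to (0, 0) when none remain."""
    kept = [c for c in candidates if within_search_range(c, range_h, range_v)]
    return kept if kept else [(0, 0)]
```

The fallback mirrors the behavior described above, where the optical flow estimate is set to (0,0) if every predicted position is outside the search range.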

The coarse-to-fine search pattern around the winner predictor is illustrated in FIG. 9. FIG. 9 illustrates example search paths for the 3-1 search for two top pixels in the paxel, the refined OF for one top-left pixel in the paxel, along with the refinement range provided by the step search block (different patterned pixels depict the possible outcomes from step search refinement for the top-left pixel). FIG. 9 also illustrates that the OF estimates for the two pixels at the beginning of the step search process can be the same, but during the step search process their OF estimates can diverge.

Regarding pre-evaluation of the predictors for the lower row of paxels, at any pixel during the step search process, for the paxels to which the pixel provides one of the top predictor values, the predicted position (OF for the pixel after the Skip-1 search) is evaluated using already fetched data and the cost is stored for later use. As a result, repeated data reads during the predictor selection step are not necessary.

Regarding computation of the spatial and temporal gradients used in the LK step, when the LK step is enabled for a level, these computations are performed for all pixels during the step search process to avoid a repeated data read by the LK step block. In this modified LK step, spatial and temporal image gradients are computed (using grayscale pixel values) for all pixel locations in the 3×3 neighborhood of the current pixel in the query image.

Computation of Spatial Gradients

The horizontal spatial gradient (Gx) and vertical spatial gradient (Gy) are computed using a central difference operator (mask=[−1 0 1]), thus needing pixels within a 5×5 neighborhood of the current pixel. At image boundaries a border extension (2 or 1 pixels as necessary) is used to compute the spatial gradients.

Computation of Temporal Gradients

The temporal image gradient (Gt) for all pixels in the 3×3 neighborhood is computed considering the neighborhood of the current pixel in the query image and the same sized neighborhood of the location to which it is estimated to have moved in the reference image (this is the winner position obtained during the skip-1 search). Pixel-wise Gt computation is performed by subtracting the reference image pixel value from the corresponding pixel value in the query image. At image boundaries a border extension (1 pixel) is used to compute gradients.

Gradient Precision

For 12 bit input images the spatial and temporal gradient values are of 13 bit precision. This precision is reduced to 10 bits by performing a 2 bit right shift and clamping the post shift values between −512 and 511.
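The gradient computations and the precision reduction described above can be sketched for a single pixel as follows. The function names, the use of edge replication for border extension, and the use of an arithmetic right shift (which rounds negative values toward minus infinity) are assumptions of this sketch.

```python
def reduce_precision(gradient, shift=2, lo=-512, hi=511):
    """Right-shift a gradient value, then clamp it to the signed 10-bit range."""
    return max(lo, min(hi, gradient >> shift))

def clamp_index(index, size):
    """Edge replication used here as the border extension (an assumption)."""
    return min(max(index, 0), size - 1)

def gradients_at(query, reference_mc, row, col):
    """Return reduced-precision (Gx, Gy, Gt) at one pixel.

    `reference_mc` is the motion compensated reference patch aligned with `query`.
    """
    height, width = len(query), len(query[0])
    # Central difference with mask [-1 0 1], horizontally and vertically.
    gx = query[row][clamp_index(col + 1, width)] - query[row][clamp_index(col - 1, width)]
    gy = query[clamp_index(row + 1, height)][col] - query[clamp_index(row - 1, height)][col]
    # Temporal gradient: query pixel minus reference pixel.
    gt = query[row][col] - reference_mc[row][col]
    return tuple(reduce_precision(g) for g in (gx, gy, gt))
```

For 12 bit pixels a difference lies in [−4095, 4095]; after the 2 bit shift it lies in [−1024, 1023], so the clamp to [−512, 511] is what actually limits large gradients to 10 bits.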

The PBM method described above estimates pixel level OF at integer pixel resolution, and this estimate is generally close to the actual pixel motions. However, it is refined further to more accurately match the latter. As described above, to refine the PBM generated initial OF estimates to provide fractional pixel accuracy (“revised OF estimate”), the QP-OFM algorithm uses at least one pass of a modified LK method.
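For orientation only, the following Python sketch shows a generic least-squares LK update over the 3×3 neighborhood followed by the [−1, 1] clamp. It is not the integer-arithmetic formulation of the disclosed modified LK method. It assumes that Gx and Gy are expressed per pixel, that Gt is the query value minus the motion compensated reference value, and that the flow maps query positions to reference positions; under the opposite sign convention the right-hand side changes sign. The function names are illustrative.

```python
def clamp_unit(value):
    """Clamp a flow change to the [-1, 1] range."""
    return max(-1.0, min(1.0, value))

def lk_refine(gx, gy, gt):
    """Solve the 2x2 LK normal equations for the flow change over one neighborhood.

    gx, gy, gt are equal-length sequences (9 values for a 3x3 neighborhood).
    Assumed model: gx * du + gy * dv = gt at every pixel, in the least-squares sense.
    """
    sxx = sum(a * a for a in gx)
    syy = sum(b * b for b in gy)
    sxy = sum(a * b for a, b in zip(gx, gy))
    sxt = sum(a * t for a, t in zip(gx, gt))
    syt = sum(b * t for b, t in zip(gy, gt))
    det = sxx * syy - sxy * sxy
    if det == 0:
        return (0.0, 0.0)  # no usable texture: leave the integer estimate unchanged
    du = (syy * sxt - sxy * syt) / det
    dv = (sxx * syt - sxy * sxt) / det
    return (clamp_unit(du), clamp_unit(dv))
```

The returned change would be added to the integer flow estimate from the step search; the clamp reflects that the block matching result is expected to be within one pixel of the true motion.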

Regarding 2D median filtering of the flow estimates, after updating all the flow estimates at a pyramid level other than the base pyramid level, they can be filtered using a 2D median filter of size 5×5. At image boundaries a border extension (2 pixels) is generally used to compute median values. Regarding flow resampling, post 2D median filtering, for all pyramid levels other than the base level, the flow estimates for the next lower pyramid level are obtained by up-scaling the flow estimate resolution using nearest neighbor interpolation and scaling the flow vectors by the resolution upscale factor (2).
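The flow resampling step can be sketched as follows. The function name and the representation of the flow field as a list of rows of (horizontal, vertical) tuples are assumptions; rounding uses Python's built-in `round`, whose treatment of exact halves is also an assumption of the sketch.

```python
def upscale_flow(flow, factor=2):
    """Nearest neighbor upsampling of a flow field, scaling and rounding each vector."""
    upscaled = []
    for row in flow:
        new_row = []
        for (u, v) in row:
            scaled = (round(u * factor), round(v * factor))
            new_row.extend([scaled] * factor)  # repeat each vector horizontally
        for _ in range(factor):
            upscaled.append(list(new_row))     # repeat each row vertically
    return upscaled

coarse = [[(1, -2), (0, 3)]]
fine = upscale_flow(coarse)
# fine has 2 rows of 4 vectors: (2, -4), (2, -4), (0, 6), (0, 6)
```

Scaling the vector values together with the resolution keeps the flow expressed in pixels of the finer level, so the result can serve directly as the pyramidal predictors for that level.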

Regarding flow post-processing, if enabled by a user through configuration, the flow post processing in the QP-OFM algorithm involves use of a 2D median filter, such as of size 5×5. The post processing block 306 takes as an input the optical flow estimates (Fi) from QP-OFM algorithm block 304 and generates the filtered optical flow output (F) shown as filtered flow vector data. For filtering the boundary pixels, a border extension of the flow field on all sides by 2 pixels is performed. Alternatively, a pseudo-1D separable median filter of size 9×9 (height×width) can also be used. The 9×9 filter is separated into 3×9 and 9×3 filters applied sequentially in that or the reverse order. The height and width of the separated filters are configured so that the number of flow samples used in the median computation is small yet provides high quality post processing. In this case, to filter the boundary pixels a border extension of the flow field on all sides by 4 pixels is performed.
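The separated median filtering described above can be sketched as follows for one flow component. The function names, the application to each component independently, and the use of edge replication as the border extension are assumptions of this sketch.

```python
from statistics import median_low

def median_filter(field, win_h, win_w):
    """Median filter with a win_h x win_w (height x width) window on a 2D list."""
    height, width = len(field), len(field[0])
    rad_h, rad_w = win_h // 2, win_w // 2
    out = []
    for r in range(height):
        out_row = []
        for c in range(width):
            samples = [
                field[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in range(-rad_h, rad_h + 1)
                for dc in range(-rad_w, rad_w + 1)
            ]
            out_row.append(median_low(samples))
        out.append(out_row)
    return out

def pseudo_1d_median(field):
    """Apply the 3x9 filter followed by the 9x3 filter."""
    return median_filter(median_filter(field, 3, 9), 9, 3)
```

Each pass takes the median of 27 samples, whereas a full 9×9 window would require the median of 81 samples per pixel, which is the saving the separated form is aiming for.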

Due to relatively low computational requirements compared to some conventional approaches such as LK and Horn Schunck (HnS), it is possible to achieve higher resolution and frame rates in real time than previously possible.

Some other specific differences of disclosed QP-OFM algorithms over other solutions, such as the Hierarchical Model-Based Motion Estimation approach disclosed by Bergen, Anandan, Hanna, and Hingorani, include:

-   The form of coarse-to-fine search using predictors and motion smoothness factors;
-   Use of (optional) historic image velocity estimates to improve optical flow estimation accuracy;
-   Reduction of the predictor evaluation requirements per pixel by opting for paxel (2×2 block of pixels) order of processing and using the same set of predictors for the entire paxel;
-   Reduction of the random memory accesses for predictor evaluation by pre-evaluation of the top-right, top and top-left predictors during step search refinement for those pixels;
-   Removal of interpolation from the algorithm entirely by combining the block matching method with the LK step to estimate the fractional pixel motion, achieving a large saving in computational requirements;
-   Pre-computation of the spatial gradient and temporal gradients during the block matching process;
-   Use of pseudo-1d separable filters; and
-   Ability to balance the computational and optical flow accuracy requirements by changing the number of predictors and search resolution.

Advantages of disclosed QP-OFM algorithms include:

-   Using only the computationally simple optimization function ‘Hamming distance’ between binary feature descriptors over a support window to perform the search.
-   Saving precious resources by using a common set of predictors for a group of pixels to minimize random access to the memory, and a pattern of non-exhaustive coarse-to-fine search strategy for optical flow estimation to minimize the computations.
-   Pre-evaluation of the top predictors saves the memory to computational block data bandwidth (fetching reference pixel data for cost function evaluation) and thus minimizes the design complexity.
-   Pre-computation of the spatial gradient and temporal gradients used in the modified LK step further saves the memory block data bandwidth (e.g., fetching reference pixel data for temporal gradient computation) and thus minimizes the design complexity.
-   As part of the design complexity reduction and DDR bandwidth reduction (described below), two levels of local memories can be used, with L2 storing the current and reference picture data in growing window fashion and L1 storing a sliding window of the reference picture data that holds the pixels needed to enable the block matching search and LK step over the desired range.
-   By combining the parametric (PBM) and nonparametric (LK) approaches to define a new algorithm, a large number of interpolation operations needed for fractional pixel motion estimation is removed, which is a significant advantage over existing algorithms.
-   Even the operations involved in the LK step involve integer arithmetic and hence require only a constant number of multiplication and addition/subtraction operations to enable computation of highly precise fractional pixel motion estimation.

Disclosed QP-OFM algorithms can be implemented as System-on-Chip (SoC) solutions that provide higher programmable compute power by offloading low level, compute intensive and repetitive CV tasks to a HWA. The OF techniques, which form a basic building block of camera-based ADAS, process large amounts of pixel data for every frame to generate high level information, and are recognized to be good candidates for a HWA design.

FIG. 10A depicts an example SoC 1000 formed on a substrate 1005 (chip) having a semiconductor surface (e.g., silicon substrate) that provides high programmable compute power by offloading low level, compute intensive and repetitive CV tasks to a HWA 1015. SoC 1000 includes a general-purpose processor (GPP) 1010 that performs general computer vision and signal processing and control tasks. HWA 1015 performs low level CV tasks such as OF and stereo disparity measurements. Another processor shown as DSP 1020 performs high level vision and signal processing tasks such as object detection, scene understanding and reconstruction, and a systems peripherals block 1030 includes program and data storage 1030 a. Systems peripherals block 1030 interfaces with sensors, peripheral and communication devices. An interconnect shown as 1022 on the substrate 1005 (e.g., one or more interconnect metal levels) couples the GPP 1010, HWA 1015, and DSP 1020 to the systems peripherals block 1030 on the substrate 1005.

FIG. 10B depicts implementation details of an example scheme 1090 of storing the reference and current picture data in local memories shown as level 1 local memory (L1) and level 2 local memory (L2) within a HWA (such as the HWA 1015 shown in FIG. 10A). This implementation is targeted to minimize the movement of data from the memory (e.g., Random-Access Memory (RAM)) to the HWA implementing the OF logic. In this scheme 1090 both the query and reference picture data used to allow proper functioning of the OF estimation process for all paxels in the query image 1050 are fetched and stored in such a way that they are fetched only once from the memory (e.g., RAM). If the query image 1050 and reference image 1060 are the pyramid level images between which the OF is being estimated and the HWA is performing computational tasks involved in computing the OF for a paxel 1070, then FIG. 10B illustrates the extent of image pixel data that needs to be in the local memory. In FIG. 10B the query image is abbreviated QI and the reference image is abbreviated RI.

Considering an embodiment that uses a paxel size of 2×2, a block matching support window of 9×9 and a neighborhood size of 5×5 to compute the census signature, the query image data that needs to be available for optical flow computation is a 14×14 neighborhood 1056 around the paxel location 1070. Similarly, while evaluating each one of the predictor or step search positions, pixel data from a 13×13 neighborhood around those positions is needed. Additionally, considering the search range of ±191 pixels in the horizontal direction and ±63 pixels in the vertical direction, a rectangular region of 396×140 pixels in pixel block 1066 around the paxel location 1070 is needed to evaluate all possible ‘valid’ predicted and step search positions along with modified LK computations as necessary. It is noted that when the entirety of the pixel block 1066 is not within the picture boundary, appropriate search range restriction logic is applied (the position being evaluated has to be within the search range and the picture boundary). When the pixel block 1056 and the 13×13 neighborhood around the search positions are not contained within the picture boundary, appropriate border extension logic is applied.
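The block sizes quoted above are consistent with the following arithmetic, shown as a short Python sketch. The decomposition into a per-side margin (half the support window plus half the census neighborhood) is an inference that reproduces the stated numbers, and the function name is illustrative.

```python
def local_window_sizes(paxel=2, support=9, census=5, range_h=191, range_v=63):
    """Derive the query block, per-position block and reference region sizes."""
    margin = support // 2 + census // 2       # 4 + 2 = 6 pixels on each side
    query_block = paxel + 2 * margin          # 2 + 12 = 14
    position_block = 1 + 2 * margin           # 1 + 12 = 13
    ref_width = 2 * range_h + query_block     # 382 + 14 = 396
    ref_height = 2 * range_v + query_block    # 126 + 14 = 140
    return query_block, position_block, ref_width, ref_height

sizes = local_window_sizes()  # (14, 13, 396, 140)
```

Changing the support window, census neighborhood or search range in this sketch shows directly how the local memory footprint would scale with those parameters.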

As the order of processing of the paxels is raster scan (i.e., left to right and top to bottom), the pixel blocks 1056 and 1066 effectively slide over the image in the same pattern. Considering the requirement and randomness of the data accesses in these regions, it is advantageous to keep this data in the local memory inside of the HWA. But if only the sliding block of pixels is stored in the local memory, then while processing the paxel data such as at the paxel location 1075, the entire block similar to pixel block 1056 or 1066 around that location needs to be fetched from the memory (e.g., RAM). The result is repeated reading of a large number of pixels, which massively increases the bandwidth requirement, and the compute resources may also have to stay idle while the local memory is populated with the required data.

It is possible to overcome these limitations and reduce the bandwidth requirement to fetching pixel data only once and reusing it across neighboring paxels and paxel rows. In this regard the query image data for the entire 14 pixel rows 1053 which are needed to process all paxels in paxel location 1070 can be fetched into a partition of on chip local memory, and the HWA can access the required set of 14×14 pixel data from there. When processing for paxel location 1075 is to begin, the pixel data 1052 can be discarded, the remainder of the pixel data from memory block 1053 can be stored (retained) and additional pixel data 1054 fetched into local memory. Similarly, the 140 reference image pixel rows in physical memory block 1063 can be stored in another partition of the on chip local memory while processing paxels in the paxel location 1070, and the sliding block of pixels 1066 fetched from there into another memory that is still closer to the HWA to allow faster access to the data.

Again, similar to the memory management done for query image data, when processing for paxel location 1075 is to begin the reference image data 1062 can be discarded, and the remainder of the data from physical memory block 1063 can be retained and additional pixel data 1064 fetched into local memory for processing of the rest of the paxels in that row. The manner of storing entire rows of pixel data in physical memory blocks like 1053 and 1063 in the local memory can be referred to as a growing window, and the storing of a sliding block of pixels like 1066 as a sliding window. In one implementation the local memory used to store the growing window of pixel data like 1053, 1063 and the sliding block of pixels like 1066 can be logical partitions of a physical memory block; in another implementation they can be stored in more than one physical memory block, such that pixel data from memory blocks 1053 and 1063 can be stored in L2 memory and data corresponding to the sliding block of pixels 1066 can be stored in L1 memory. In another embodiment a similar multilevel local memory architecture and picture data management scheme can be used when a processor is used to implement the OF logic.

Applications for disclosed OF estimation include solving ADAS tasks such as moving object segmentation, object tracking, time-to-contact (collision) estimation, depth estimation and scene understanding. Disclosed OF estimation enables improved performance of ADAS applications such as obstacle detection and collision avoidance auto-emergency braking as compared to known DOF estimation algorithms.

EXAMPLES

Disclosed embodiments are further illustrated by the following specific Examples, which should not be construed as limiting the scope or content of this Disclosure in any way.

In comparison to a fully PBM-based DOF algorithm, an example disclosed QP-OFM algorithm evaluates only about 24 search positions per pixel, uses no interpolations and calculates only approximately 1.33 census transforms per pixel. This represents a large computing saving when processing 60 million pixels per second, which is achieved by adding a fraction of the computing of the OF estimate using a modified LK block. The QP-OFM algorithm further simplifies the HWA design by splitting the compute operations of a functional block into logical subsets and combining subsets of operations of different sub-blocks into a set such that they can reuse the pixel data fetched for operations. One example of this type of design is pre-evaluation of the top predictors of a paxel during the step search operation of the top pixels. Another example is pre-computation of the spatial and temporal gradients used by the LK step during the step search process. For disclosed QP-OFM algorithms, such re-packaging of the compute tasks of different functional blocks has led to an almost 6 fold reduction of SL2 memory complexity compared to the algorithm disclosed in Pub. Pat. App. No. 2015/0365696.

Those skilled in the art to which this disclosure relates will appreciate that many other embodiments and variations of embodiments are possible within the scope of the claimed invention, and further additions, deletions, substitutions and modifications may be made to the described embodiments without departing from the scope of this disclosure.

The invention claimed is:
 1. An image processing system, comprising: a processor enabled to implement: performing optical flow (OF) estimation between a first frame of video and a second frame of video using a pyramidal block matching (PBM) method to generate an initial optical flow (OF) estimate at a base pyramid level having integer pixel resolution, and refining the initial OF estimate using at least one pass of a modified Lucas-Kanade (LK) method to provide a revised OF estimate having fractional pixel resolution.
 2. The system of claim 1, wherein the PBM method utilizes a hierarchical coarse-to-fine search strategy using predictors and motion smoothness factors, a Binary Census Transform for pixel description, and a Hamming distance as a cost function.
 3. The system of claim 1, wherein the PBM method utilizes a pixel order of processing using a same set of predictors for each entire pixel and a pre-evaluation strategy for the set of predictors.
 4. The system of claim 1, wherein the PBM method uses pre-computation of a spatial gradient and temporal gradient during block matching steps of the PBM method.
 5. The system of claim 1, further comprising OF determination logic implemented in part by a hardware accelerator (HWA), the HWA utilizing a combination of two levels of local memory with one memory for storing picture data from the first frame of video and from the second frame in a growing window fashion, and another memory to store a sliding window of said picture data from the second frame of video.
 6. The system of claim 1, wherein the modified LK method is exclusive of an image warping step and splits computing the revised OF estimate into sub-tasks of spatial and temporal gradient computation and then other operations, uses reduced precision gradient data for OF refinement, and uses only integer arithmetic to perform compute tasks including matrix inversion.
 7. The system of claim 2, further comprising OF determination logic configuring the processor to include in said cost function for each sequential search of the search strategy a motion vector cost value that combines a motion smoothness factor and distance between a median predictor value and a candidate pixel.
 8. The system of claim 1, wherein the at least one pass consists of a single pass which applies the modified LK method at all pyramid levels of a search pyramid used by the PBM method including the base pyramid level after the PBM method concludes.
 9. The system of claim 1, further comprising a post process filter for post processing filtering using pseudo-1d separable median filters to remove impulsive noise from the revised OF estimate.
 10. A method of optical flow (OF) estimation, comprising: using a processor implementing a stored quasi-parametric optical flow measurement (QP-OFM) algorithm which combines a pyramidal block matching (PBM) method with a modified Lucas-Kanade (LK) method, the QP-OFM algorithm performing: optical flow (OF) estimating between a first frame of video and a second frame of video using the PBM method to generate an initial OF estimate at a base pyramid level having integer pixel resolution, and refining the initial OF estimate using at least one pass of the modified LK method to provide a revised OF estimate having fractional pixel resolution.
 11. The method of claim 10, wherein the PBM method utilizes a hierarchical coarse-to-fine search strategy using predictors and motion smoothness factors, a Binary Census Transform for pixel description, and a Hamming distance as a cost function.
 12. The method of claim 10, wherein said PBM method utilizes a paxel order of processing using a same set of predictors for each entire paxel and a pre-evaluation strategy for the set of predictors.
 13. The method of claim 10, wherein the PBM method uses pre-computation of a spatial gradient and temporal gradient during block matching steps of the PBM method.
 14. The method of claim 10, wherein the modified LK method is exclusive of an image warping step and splits computing the revised OF estimate into sub-tasks of spatial gradient computation and then other operations, uses reduced precision gradient data for OF refinement, and uses only integer arithmetic to perform compute tasks including matrix inversion.
 15. The method of claim 10, wherein the QP-OFM algorithm is implemented at least in part as a System-on-a-chip (SOC) including a hardware accelerator (HWA), said HWA utilizing a combination of two levels of local memory with one memory for storing picture data from the first frame of video and from the second frame in a growing window fashion, and another memory to store a sliding window of said picture data from the second frame of video.
 16. The method of claim 10, wherein the method is implemented by an image processing system comprising a processor configured for generating a scene analysis.
 17. The method of claim 16, wherein the scene analysis is used by an Advanced Driver Assistance System (ADAS) for obstacle detection or collision avoidance auto-emergency braking.
 18. The method of claim 11, wherein the method includes in the cost function for each sequential search of the search strategy a motion vector cost value that combines a motion smoothness factor and distance between a median predictor value and a candidate pixel.
 19. The method of claim 10, further comprising post processing filtering using pseudo-1D separable median filters to remove impulsive noise from the revised OF estimate.
 20. The method of claim 10, wherein the at least one pass consists of a single pass which applies the modified LK method at all pyramid levels of a search pyramid used by the PBM method including the base pyramid level after the PBM method concludes.
 21. The system of claim 1, wherein the first frame of video is a reference image and the second frame of video is a query image.
 22. The method of claim 10, wherein the first frame of video is a reference image and the second frame of video is a query image.
 23. An image processing system, comprising: circuitry for performing optical flow (OF) estimation between a first frame of video and a second frame of video using a pyramidal block matching (PBM) method to generate an initial optical flow (OF) estimate at a base pyramid level having integer pixel resolution, and circuitry for refining the initial OF estimate using at least one pass of a modified Lucas-Kanade (LK) method to provide a revised OF estimate having fractional pixel resolution.
 24. An image processing system, comprising: means for performing optical flow (OF) estimation between a first frame of video and a second frame of video using a pyramidal block matching (PBM) method to generate an initial optical flow (OF) estimate at a base pyramid level having integer pixel resolution, and means for refining the initial OF estimate using at least one pass of a modified Lucas-Kanade (LK) method to provide a revised OF estimate having fractional pixel resolution. 
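
By way of illustration only, and without limiting the claims, the pixel description and cost function recited in claims 2 and 11 (a Binary Census Transform compared by Hamming distance) may be sketched as follows. This is a minimal sketch under stated assumptions: the function names `census_transform`, `hamming` and `block_match`, the 3x3 neighbourhood, the `>=` comparison rule, the single-pixel descriptor and the exhaustive one-pixel search are all hypothetical choices made for the example. The sketch omits the pyramid, the predictors, the motion smoothness factors and the modified LK refinement.

```python
def census_transform(img, y, x, radius=1):
    """Pack one bit per neighbour: 1 if neighbour >= centre (assumed rule)."""
    center = img[y][x]
    bits = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the centre pixel is not compared with itself
            bits = (bits << 1) | (1 if img[y + dy][x + dx] >= center else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two census descriptors differ."""
    return bin(a ^ b).count("1")


def block_match(ref, qry, y, x, search=1, radius=1):
    """Integer (dy, dx) that minimises the Hamming cost (hypothetical search)."""
    target = census_transform(ref, y, x, radius)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            candidate = census_transform(qry, y + dy, x + dx, radius)
            cost = hamming(target, candidate)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


# Usage: a single bright pixel moves one column to the right between frames.
ref = [[0] * 7 for _ in range(7)]
qry = [[0] * 7 for _ in range(7)]
ref[3][3] = 9
qry[3][4] = 9
print(block_match(ref, qry, 3, 3))  # (0, 1)
```

The result of such a search has integer pixel resolution, which is why the claims follow it with a refinement pass that provides fractional pixel resolution.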